Genome Medicine

56 training papers 2019-06-25 – 2026-03-07

Top medRxiv preprints most likely to be published in this journal, ranked by match strength.

1
Community-Driven Copy Number Variant Discovery at Scale: Results from a Rare Disease Genomics Hackathon
2025-08-12 genetic and genomic medicine 10.1101/2025.08.08.25333317
#1 (22.7%)

Purpose: Copy number variants (CNVs) are a major contributor to rare genetic diseases, but their detection and interpretation from short-read genome sequencing (srGS) data remain challenging, especially at scale. Large amounts of existing srGS data remain under-analyzed for clinically relevant CNVs. Methods: During a collaborative Hackathon, we developed and applied scalable CNV analysis workflows to srGS data from three unsolved, exome-negative, rare disease cohorts: Primary Immunodeficiency (N = ...

2
Pangenome-based identification of cryptic pathogenic variants in undiagnosed rare disease patients
2025-07-11 genetic and genomic medicine 10.1101/2025.07.08.25330875
#1 (22.6%)

Background: Despite widespread implementation of exome and genome sequencing, a substantial proportion of rare disease patients remain undiagnosed due to inherent limitations in detecting structural, repetitive, and regulatory variants. Methods: We applied long-read sequencing (LRS) to 40 individuals from 33 previously undiagnosed Korean families. De novo assemblies were integrated into a graph-based pangenome workflow, enabling sensitive detection of single-nucleotide, structural, and tandem-repea...

3
Investigating penetrance of severe combined immunodeficiency variants in an adult population cohort: implications for genomic newborn screening
2026-02-18 genetic and genomic medicine 10.64898/2026.02.17.26346478
#1 (22.4%)

Severe combined immunodeficiency (SCID) is a heterogeneous, recessive disorder, associated with the onset of severe, recurrent infections in the first few months of life. SCID is fatal if left untreated, but outcomes can be significantly improved by prompt diagnosis and treatment, particularly prior to onset of infection. Consequently, SCID is already included in many newborn screening programmes around the world, as well as multiple international genomic newborn screening (gNBS) research progra...

4
Functional filter for whole genome sequence data identifies stress impact, non-coding, alternate polyadenylation site variants >5kb from coding DNA
2023-05-14 genetic and genomic medicine 10.1101/2023.05.10.23289736
#1 (21.8%)

Despite whole genome sequencing (WGS), why do many single gene disorder cases remain unsolved, impeding diagnosis and preventative care for people whose disease-causing variants escape detection? Early WGS data analytic steps prioritize protein-coding sequences. To simultaneously prioritise variants in non-coding regions rich in transcribed and critical regulatory sequences, we developed GROFFFY, an analytic tool which integrates coordinates for regions with experimental evidence of functionalit...

5
SyMetrics: An Integrated Machine Learning Model for Evaluating the Pathogenicity of Synonymous Variants in the Human Genome
2025-03-23 genetic and genomic medicine 10.1101/2025.03.21.25324414
#1 (18.2%)

Synonymous single nucleotide variants (sSNVs), traditionally seen as neutral, are now recognized for their biological impact. To assess their relevance, we developed SyMetrics, a framework that integrates predictors of splicing, RNA stability, evolutionary conservation, codon usage, synonymous variation effects, sequence properties, and allele frequency. We analyzed all possible sSNVs across the human genome, and our machine-learning model achieved 97% accuracy in distinguishing deleterious from...

6
Exome copy number variant detection, analysis and classification in a large cohort of families with undiagnosed rare genetic disease
2023-10-05 genetic and genomic medicine 10.1101/2023.10.05.23296595
#1 (18.1%)

Copy number variants (CNVs) are significant contributors to the pathogenicity of rare genetic diseases and with new innovative methods can now reliably be identified from exome sequencing. Challenges still remain in accurate classification of CNV pathogenicity. CNV calling using GATK-gCNV was performed on exomes from a cohort of 6,633 families (15,759 individuals) with heterogeneous phenotypes and variable prior genetic testing collected at the Broad Institute Center for Mendelian Genomics of th...

7
RNA sequencing uplifts diagnostic rate in undiagnosed rare disease patients
2023-07-08 genetic and genomic medicine 10.1101/2023.07.05.23292254
#1 (18.1%)

Background: RNA-sequencing is increasingly being used as a complementary tool to DNA sequencing in diagnostics where DNA analysis has been uninformative. RNA-sequencing allows us to identify alternative splicing and aberrant gene expression allowing for improved interpretation of variants of unknown significance (VUS). Additionally, RNA-sequencing provides the opportunity not only to look at the splicing effects of known VUSs but also to scan the transcriptome for abnormal splicing events and expr...

8
Harnessing the 100,000 Genomes Project whole genome sequencing data - an unbiased systematic tool to filter by biologically validated regions of functionality
2020-04-02 genetic and genomic medicine 10.1101/2020.03.30.20047209
#1 (18.0%)

Whole genome sequencing (WGS) is championed by the UK National Health Service (NHS) to identify genetic variants that cause particular diseases. The full potential of WGS has yet to be realised as early data analytic steps prioritise protein-coding genes, and effectively ignore the less well annotated non-coding genome which is rich in transcribed and critical regulatory regions. To address this, we developed a filter, which we call GROFFFY, and validated it in WGS data from hereditary haemorrhagic tela...

9
Integration of proteomics with genomics and transcriptomics increases the diagnostic rate of Mendelian disorders
2021-03-12 genetic and genomic medicine 10.1101/2021.03.09.21253187
#1 (17.8%)

For lack of functional evidence, genome-based diagnostic rates cap at approximately 50% across diverse Mendelian diseases. Here, we demonstrate the effectiveness of combining genomics, transcriptomics, and, for the first time, proteomics and phenotypic descriptors, in a systematic diagnostic approach to discover the genetic cause of mitochondrial diseases. On fibroblast cell lines from 145 individuals, tandem mass tag labelled proteomics detected approximately 8,000 proteins per sample and covere...

10
CAPICE: a computational method for Consequence-Agnostic Pathogenicity Interpretation of Clinical Exome variations
2019-11-29 genetic and genomic medicine 10.1101/19012229
#1 (17.7%)

Exome sequencing is now mainstream in clinical practice; however, identification of pathogenic Mendelian variants remains time consuming, partly because the limited accuracy of current computational prediction methods leaves much manual classification. Here we introduce CAPICE, a new machine-learning based method for prioritizing pathogenic variants, including SNVs and short InDels, that outperforms the best general (CADD, GAVIN) and consequence-type-specific (REVEL, ClinPred) computational prediction m...

11
ClinSV: Clinical grade structural and copy number variant detection from whole genome sequencing data
2020-07-02 genetic and genomic medicine 10.1101/2020.06.30.20143453
#1 (17.6%)

Whole genome sequencing (WGS) has the potential to outperform clinical microarrays for the detection of structural variants (SV) including copy number variants (CNVs), but has been challenged by high false positive rates. Here we present ClinSV, a WGS based SV integration, annotation, prioritisation and visualisation method, which identified 99.8% of pathogenic ClinVar CNVs >10kb and 11/11 pathogenic variants from matched microarrays. The false positive rate was low (1.5-4.5%) and reproducibilit...

12
ExonViz: A website and Python package to visualize transcripts and genetic variants
2024-09-20 genetic and genomic medicine 10.1101/2024.09.18.24313945
#1 (17.5%)

Visualization of genes and genetic variants as well as transcript structure is essential within the human genetics community. Such illustrations represent a key tool in communicating genetic concepts and facilitating discussions on therapeutic interventions. There are currently no easily usable tools that allow users to draw all features required for a comprehensive overview of a transcript's structure and the localisation of variants of interest. Here we introduce ExonViz, an online appl...

13
The Refined Recurrence Risk of De Novo variants Due to Parental Mosaicism
2025-07-02 genetic and genomic medicine 10.1101/2025.07.02.25330538
#1 (17.5%)

Parents of children with genetic disorders due to de novo variants are counselled on a recurrence risk estimate of 1-5% for further affected siblings, while the actual probability varies between 0 and 50%. This discrepancy is well known, but barely investigated. We enrolled 135 families, in which a child had been previously identified with a pathogenic seemingly de novo variant (in 140 genes). Covering two germ layers, we collected blood (n=269), buccal (n=223) and nail samples (n=223) of both p...

14
Benign-Ex: Delineating Regions of the Human Genome Benign to Copy Number Variation.
2022-10-19 genetic and genomic medicine 10.1101/2022.10.17.22280252
#1 (17.5%)

While copy number variants (CNVs) have been identified as an important cause of rare genetic disorders, they have also been identified in unaffected control populations, making clinical interpretation of these lesions challenging. Discriminating benign CNVs from those pathogenic for rare genetic disorders, therefore, relies on understanding what regions of the human genome are tolerant to copy number variation. Benign-Ex is a python-based program that uses information from databases of CNVs to g...

15
HiFi sequencing accurately identifies clinically relevant variants in paralogous genes
2025-10-31 genetic and genomic medicine 10.1101/2025.10.29.25339045
#1 (17.4%)

Short-read sequencing (SRS) methods have improved the detection of small genetic variants but remain limited in highly homologous genomic regions, such as segmental duplications with gene-pseudogene pairs. These paralogous regions often require complex, locus-specific assays for accurate analysis. Long-read genome sequencing (lrGS) technologies, such as PacBio HiFi sequencing, can span these regions but still face challenges in variant calling due to alignment ambiguities. Here, we evaluated Pac...

16
Mode and dynamics of vanA-type vancomycin-resistance dissemination in Dutch hospitals
2020-07-22 genetic and genomic medicine 10.1101/2020.07.21.20158808
#1 (17.4%)

Background: Enterococcus faecium is a commensal of the gastrointestinal tract of animals and humans but also a causative agent of hospital-acquired infections. Resistance against glycopeptides and especially to vancomycin, a first-line antibiotic to treat infections with multidrug-resistant Gram-positive pathogens, has motivated the inclusion of E. faecium in the WHO global priority list. Vancomycin resistance can be conferred by the vanA gene cluster on the transposon Tn1546, which is frequently ...

17
Estimating the prevalence of late-onset Fabry disease in the US in 2024
2024-12-14 public and global health 10.1101/2024.12.13.24319001
#1 (17.0%)

Fabry disease is a rare lysosomal storage condition in which sphingolipid levels build up to harmful levels in various bodily organs, eventually leading to life-threatening complications such as stroke and kidney failure. Fabry disease is caused by rare pathogenic alleles in the GLA gene on chromosome X and may present as an early or late-onset disease depending on the identity of the causal allele and the severity of its effect on the gene product. Epidemiological studies have widely varied in ...

18
Detecting pathogenic structural variation in families with undiagnosed rare disease in a national genome project
2025-08-19 genetic and genomic medicine 10.1101/2025.08.19.25333674
#1 (15.3%)

Background: Whole-genome sequencing (WGS) projects for rare disease diagnosis typically yield a diagnostic rate of approximately 25-40%, dependent particularly on patient selection and the extent of prior genetic testing. The Scottish Genomes Partnership (SGP) is a collaborative research programme involving four Scottish Regional Genetics Centres, four Scottish Medical Schools, and Genomics England's 100,000 Genomes Project. It aims to facilitate genome sequencing and diagnosis for patients in the ...

19
Structural variant calling and clinical interpretation in 6224 unsolved rare disease exomes
2023-10-29 genetic and genomic medicine 10.1101/2023.10.28.23297720
#1 (15.3%)

Structural variants (SVs), including large deletions, duplications, inversions, translocations, and complex SVs have the potential to disrupt gene function resulting in rare disease. Nevertheless, current pipelines and clinical decision support systems for exome sequencing (ES) tend to focus on small alterations such as single nucleotide variants (SNVs) and insertions-deletions shorter than 50 base pairs (indels). Additionally, detection and interpretation of large copy-number variants (CNVs) ar...

20
SNPred outperforms other ensemble-based SNV pathogenicity predictors and elucidates the challenges of using ClinVar for evaluation of variant classification quality.
2023-09-08 genetic and genomic medicine 10.1101/2023.09.07.23295192
#1 (14.9%)

Background: Current single nucleotide variant (SNV) pathogenicity prediction tools assess various properties of genetic variants and provide a likelihood of causing a disease. This information aids in variant prioritization, the process of narrowing down the list of potential pathogenic variants, and therefore facilitates diagnostics. Assessing the effectiveness of SNV pathogenicity tools using ClinVar data is a widely adopted practice. Our findings demonstrate that this conventional method ...